Poster: Branko Collin Date: Feb 22, 2005 7:56am
Forum: texts Subject: Re: Ideas to help proofreading?

Distributed Proofreaders have developed a number of tools to help us proofread. Although they are likely to be only really useful to DP (the earliest of those started out as tools to check if a text conformed to PG's formatting guidelines), perhaps one of your programmers could look at their source and see if any general rules can be gleaned from them.

Especially useful may be our pre-processing tools, as they try and catch some of the commonest problems.

We search for scanning errors using spell checkers, and for the ones that are valid English words using lists of "stealth scannos". For instance, "and" is commonly mis-OCR-ed as "arid". (Similar lists exist for LOTE.) This method would probably be too time-consuming for TIA, but you could construct a tool that finds the spelling errors OCR software commonly produces: "tbe" for "the", for instance.
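To give an idea of what such a tool might look like, here is a minimal sketch in Python. It is not DP's actual code: the two word lists are tiny illustrative samples, and the names (`OCR_MISSPELLINGS`, `STEALTH_SCANNOS`, `find_suspects`) are made up for this example. A real tool would load much longer lists from files.

```python
import re

# Illustrative sample lists only (assumption): real lists are far longer.
# Non-words that OCR software commonly produces, with the likely intended word.
OCR_MISSPELLINGS = {"tbe": "the", "tbat": "that", "wbich": "which"}
# "Stealth scannos": valid English words that are often OCR errors.
STEALTH_SCANNOS = {"arid": "and", "modem": "modern"}

WORD_RE = re.compile(r"[A-Za-z]+")


def find_suspects(text):
    """Return (line_number, word, suggestion, kind) for each suspect word."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            key = word.lower()
            if key in OCR_MISSPELLINGS:
                suspects.append((line_no, word, OCR_MISSPELLINGS[key], "misspelling"))
            elif key in STEALTH_SCANNOS:
                # A stealth scanno may be correct in context, so only flag it.
                suspects.append((line_no, word, STEALTH_SCANNOS[key], "stealth"))
    return suspects


if __name__ == "__main__":
    sample = "Tbe cat arid the dog\nlived in a modem house."
    for line_no, word, suggestion, kind in find_suspects(sample):
        print(f"line {line_no}: {word!r} -> {suggestion!r}? ({kind})")
```

Note that the sketch only reports suspects and never changes the text: "arid" and "modem" are sometimes exactly what the author wrote, so a person still has to decide.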

We also have several anecdotes about text that looks like an error to everything except a human proofreader. How far you want to take automation therefore also depends on how many new errors you are willing to introduce.

If you need one of your interns to actually look at a text, our special proofing font may help; it's ugly as sin, but helps errors really stand out.

Not all of our tools are available through SourceForge; our Help pages link to them, though.

When you start working with volunteers, try and make it as easy as possible for them to contribute.